ABSTRACT


Securing handwritten documents in the real world can be extremely difficult.
Common problems include misplaced documents, lack of access from other
locations, and physical damage. To keep the information secure and address
these problems, we convert it into digital format. The main aim of our
application is to recognize handwritten text and display it as digital text.
Image processing is a significant part of data analysis today. In image
processing, the visible text from the real world (the input) must be processed
precisely in order to reproduce the same information (the output) accurately.
To do this, the system must accurately recognize the text present in the image.
The proposed system aims to achieve this. The process works as follows: the
image containing the handwritten text is fed to the system and passed into a
neural network, which recognizes the handwritten text in the image and
displays it as digital text. The digital text can be copied for use elsewhere,
used to produce formal documents, or used as input for data processing.
With this process, the information can be stored securely and accessed from
anywhere at any time, and it is not subject to physical damage because it is
in digital format.
Motivation

Text recognition in images is an active research area that
attempts to develop computer applications able to read text
from images automatically. There is now a huge demand for
storing the information available in paper documents in a
computer-readable form for later use. One simple way to
store information from paper documents in a computer system
is to scan the documents and store the scans as documents.
The challenges involved are the font characteristics of the
characters in paper documents and the quality of the images.
Character recognition mechanisms are needed to perform
document image analysis, which transforms documents from
paper format to electronic format. In this paper, we have
reviewed and analyzed different methods for text recognition
from images. The objective of this review paper is to
summarize the well-known methods so that the reader can
understand them better.


Existing system

Character recognition originated as early as 1870, when
Carey invented the retina scanner, an image transmission
system using photocells. It was applied as an aid to the
visually handicapped by the Russian scientist Tyurin in
1900. However, the first-generation machines appeared at
the beginning of the 1960s with the development of digital
computers. This was the first time OCR was realized as a
data processing application for the business world [Mantas,
1986] [1]. The first-generation machines are characterized
by the "constrained" letter shapes that the OCRs could read.
These symbols were specially designed for machine reading,
and they did not even look natural. The first commercialized
OCR of this generation was the IBM 1418, which was designed
to read a special IBM font, 407. The recognition method was
template matching, which compares the character image
with a library of prototype images for each character of each
font.

Proposed system

The proposed system is a Handwritten Text Recognition (HTR)
system implemented with TensorFlow (TF) and trained on the
IAM off-line HTR dataset [2]. This Neural Network (NN) model
recognizes the text contained in images of segmented words.
As these word images are smaller than images of complete
text lines, the NN can be kept small and training on the CPU
is feasible. About 3/4 of the words in the validation set are
recognized correctly, and the character error rate is around
10%.


Architecture 



Figure 1: Project Architecture 
Figure 2: Process flow

Methodology 

This project is developed using the Tesseract tess-2 module,
a computer vision API library [3]. The model is pretrained
on a dataset containing the literals of the language, which
are in turn compared with the input image file to produce
the required output.
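
The idea of comparing an input image against stored character
shapes can be illustrated with a toy nearest-prototype matcher.
This is only a sketch: the 3x3 binary glyphs, the PROTOTYPES
table, and the function names are hypothetical, and it does not
reproduce how Tesseract works internally.

```python
# Hypothetical 3x3 binary prototypes (1 = ink, 0 = background).
PROTOTYPES = {
    "I": ["010",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010"],
}


def mismatch(glyph, prototype):
    """Count pixels that differ between two equally sized binary glyphs."""
    return sum(
        g != p
        for glyph_row, proto_row in zip(glyph, prototype)
        for g, p in zip(glyph_row, proto_row)
    )


def classify(glyph):
    """Return the label of the prototype with the fewest mismatching pixels."""
    return min(PROTOTYPES, key=lambda label: mismatch(glyph, PROTOTYPES[label]))


# An "L" with one missing pixel (illustrative input, not real data).
noisy_l = ["100",
           "100",
           "110"]
print(classify(noisy_l))
```

A real engine works on much larger images and learned features,
but the principle of scoring the input against every known
literal and keeping the best match is the same.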

Advantages of System 

1. Converting handwritten text to digital text. 

2. The converted text can be stored on the mobile device itself.

3. Copy the converted digital text. 

4. Share the converted digital text via email, WhatsApp, etc.

Improvements 

1. The model can be trained further to get more accurate results.

2. It can be trained on multiple data sets to adapt to 
different languages. 

3. A text-to-speech feature can be added.

Result and Analysis 

We have tested the performance of our proposed system on 
many samples of handwritten text. 

Here are a few screenshots of the results:



Screenshot 1: output screen 
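
Recognition quality of the kind reported for the proposed
system (word accuracy and character error rate) is commonly
measured with the edit distance between the recognized text
and the ground truth. The following is a minimal sketch under
that assumption; the function names and the sample word pairs
are illustrative and are not taken from the evaluated system.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(pairs):
    """Return (character error rate, word accuracy) for (recognized, truth) pairs."""
    total_chars = sum(len(truth) for _, truth in pairs)
    total_errors = sum(edit_distance(rec, truth) for rec, truth in pairs)
    correct_words = sum(1 for rec, truth in pairs if rec == truth)
    return total_errors / total_chars, correct_words / len(pairs)


# Made-up (recognized, ground truth) pairs for illustration only.
samples = [("little", "little"), ("hause", "house"),
           ("text", "text"), ("wrd", "word")]
cer, word_acc = evaluate(samples)
print(f"CER: {cer:.2%}, word accuracy: {word_acc:.2%}")
```

The character error rate divides the total number of character
edits by the total number of ground-truth characters, so long
words weigh more than short ones.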


network and preprocessing of the image using edge detection
and normalization are the ideal choice for degraded, noisy
images. Training the neural network with features extracted
from sample images of each character improves detection
accuracy to a great extent. The proposed methodology has
produced good results for images containing handwritten text
written in different styles, sizes, and alignments with
varying backgrounds. The system was developed and evaluated
on a set of sample images containing handwritten text [5].
We discussed an NN that is able to recognize text in images.
The NN consists of 5 CNN and 2 RNN layers and outputs a
character-probability matrix. This matrix is used either for
CTC loss calculation or for CTC decoding. An implementation
using TF is provided.
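
How a character-probability matrix becomes text can be shown
with best-path (greedy) CTC decoding: take the most probable
label at each time step, collapse repeated labels, and drop
the blank. The sketch below is plain Python rather than the TF
implementation mentioned above; the two-letter character set,
the probabilities, and the choice of the last column as the
CTC blank are assumptions made for illustration.

```python
def ctc_best_path(prob_matrix, charset):
    """Greedy CTC decoding of a T x (len(charset) + 1) probability matrix.

    The last column is assumed to hold the CTC blank label.
    """
    blank = len(charset)
    # Step 1: most probable label index at every time step.
    best = [max(range(len(row)), key=row.__getitem__) for row in prob_matrix]
    # Steps 2 and 3: collapse repeated labels, then skip blanks.
    decoded = []
    previous = None
    for label in best:
        if label != previous and label != blank:
            decoded.append(charset[label])
        previous = label
    return "".join(decoded)


charset = "ab"
# Made-up probabilities for 5 time steps over ("a", "b", blank).
matrix = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a again, collapsed into the previous one
    [0.1, 0.1, 0.8],  # blank separates the two "a" characters
    [0.6, 0.3, 0.1],  # a
    [0.2, 0.7, 0.1],  # b
]
print(ctc_best_path(matrix, charset))
```

The blank label is what allows a doubled character to be
recognized: without a blank between them, the repeated "a"
labels would collapse into a single character.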

References 

[1] R. Smith, "A Simple and Efficient Skew Detection
Algorithm via Text Row Accumulation", Proc. of the 3rd
Int. Conf. on Document Analysis and Recognition (Vol.
2), IEEE 1995, pp. 1145-1148.

[2] S.V. Rice, F.R. Jenkins, T. A. Nartker, The Fourth Annual 
Test of OCR Accuracy, Technical Report 95-03, 
Information Science Research Institute, University of 
Nevada, Las Vegas, July 1995. 

[3] R.W. Smith, The Extraction and Recognition of Text 
from Multimedia Document Images, PhD Thesis, 
University of Bristol, November 1987. 

[4] Chirag I Patel, Ripal Patel, Palak Patel, "Handwritten
Character Recognition Using Neural Networks",
International Journal of Scientific & Engineering
Research, Volume 2, Issue 5, May 2011.

[5] Kauleshwar Prasad, Devvrat C Nigam,
Ashmika Lakhotiya, Dheeren Umre, "Character
Recognition Using Neural Toolbox", International
Journal of u- and e- Service, Science and Technology,
Vol. 6, No. 1, February 2013.


@ IJTSRD | Unique Paper ID - IJTSRD23508 | Volume - 3 | Issue - 3 | Mar-Apr 2019 


















